Hyperprior Induced Unsupervised Disentanglement of Latent Representations
We address the problem of unsupervised disentanglement of latent
representations learnt via deep generative models. In contrast to current
approaches that operate on the evidence lower bound (ELBO), we argue that
statistical independence in the latent space of VAEs can be enforced in a
principled hierarchical Bayesian manner. To this effect, we augment the
standard VAE with an inverse-Wishart (IW) prior on the covariance matrix of the
latent code. By tuning the IW parameters, we are able to encourage (or
discourage) independence in the learnt latent dimensions. Extensive
experimental results on a range of datasets (2DShapes, 3DChairs, 3DFaces and
CelebA) show that our approach outperforms the β-VAE and is competitive with
the state-of-the-art FactorVAE. Our approach achieves significantly better
disentanglement and reconstruction on a new dataset (CorrelatedEllipses) which
introduces correlations between the factors of variation. Comment: AAAI-201
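The abstract's core idea is that an inverse-Wishart (IW) prior over the latent covariance can be tuned to encourage independence. A minimal sketch of that mechanism, assuming illustrative parameter values (the dimensionality `d`, degrees of freedom `nu`, and scale `Psi` below are not the paper's settings): with an identity-proportional scale and large `nu`, covariance matrices drawn from the IW prior concentrate near the identity, i.e. near-independent latent dimensions.

```python
import numpy as np
from scipy.stats import invwishart

d = 4                            # latent dimensionality (illustrative)
nu = 100                         # degrees of freedom: larger -> tighter prior
Psi = np.eye(d) * (nu - d - 1)   # scale chosen so the IW mean is the identity

# Draw covariance matrices from the IW prior and average them.
samples = invwishart.rvs(df=nu, scale=Psi, size=2000, random_state=42)
mean_cov = samples.mean(axis=0)

# Off-diagonal entries of the mean covariance are close to zero, so the
# prior pushes the latent code toward statistically independent dimensions.
off_diag = np.abs(mean_cov - np.diag(np.diag(mean_cov))).max()
print(mean_cov.round(2))
```

Lowering `nu` relaxes the prior, which is the "encourage (or discourage) independence" knob the abstract refers to.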
Probable Object Location (POLo) Score Estimation for Efficient Object Goal Navigation
To advance the field of autonomous robotics, particularly in object search
tasks within unexplored environments, we introduce a novel framework centered
around the Probable Object Location (POLo) score. Utilizing a 3D object
probability map, the POLo score allows the agent to make data-driven decisions
for efficient object search. We further enhance the framework's practicality by
introducing POLoNet, a neural network trained to approximate the
computationally intensive POLo score. Our approach addresses critical
limitations of both end-to-end reinforcement learning methods, which suffer
from memory decay over long-horizon tasks, and traditional map-based methods
that neglect visibility constraints. Our experiments, involving the first phase
of the OVMM 2023 challenge, demonstrate that an agent equipped with POLoNet
significantly outperforms a range of baseline methods, including end-to-end RL
techniques and prior map-based strategies. To provide a comprehensive
evaluation, we introduce new performance metrics that offer insights into the
efficiency and effectiveness of various agents in object goal navigation. Comment: Under review
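The abstract describes scoring candidate locations from a 3D object probability map. The scoring rule below is an illustrative stand-in, not the paper's POLo formula: each hypothetical waypoint is scored by the total object probability of the voxels it would newly observe, given a per-waypoint visibility mask, which captures why a map-based score can respect visibility constraints.

```python
import numpy as np

def polo_like_score(prob_map, visible, already_seen):
    """Sum object probability over voxels that are visible from a waypoint
    and have not yet been observed (illustrative, not the paper's score)."""
    newly_visible = visible & ~already_seen
    return prob_map[newly_visible].sum()

rng = np.random.default_rng(1)
prob_map = rng.random((8, 8, 8)) * 0.1   # mostly low object probability
prob_map[6, 6, 6] = 0.9                  # one likely object location
seen = np.zeros((8, 8, 8), dtype=bool)
seen[:4] = True                          # half the map already explored

# Two hypothetical waypoints: one re-views explored space, one covers
# the unexplored region containing the high-probability voxel.
view_a = np.zeros_like(seen); view_a[:4] = True
view_b = np.zeros_like(seen); view_b[5:] = True
scores = [polo_like_score(prob_map, v, seen) for v in (view_a, view_b)]
```

Under this rule the agent prefers `view_b`, since re-observing explored space contributes nothing; the paper's POLoNet then amortizes this kind of map-based computation with a learned approximation.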
Factorized Inference in Deep Markov Models for Incomplete Multimodal Time Series
Integrating deep learning with latent state space models has the potential to
yield temporal models that are powerful, yet tractable and interpretable.
Unfortunately, current models are not designed to handle missing data or
multiple data modalities, which are both prevalent in real-world data. In this
work, we introduce a factorized inference method for Multimodal Deep Markov
Models (MDMMs), allowing us to filter and smooth in the presence of missing
data, while also performing uncertainty-aware multimodal fusion. We derive this
method by factorizing the posterior p(z|x) for non-linear state space models,
and develop a variational backward-forward algorithm for inference. Because our
method handles incompleteness over both time and modalities, it is capable of
interpolation, extrapolation, conditional generation, label prediction, and
weakly supervised learning of multimodal time series. We demonstrate these
capabilities on both synthetic and real-world multimodal data under high levels
of data deletion. Our method performs well even with more than 50% missing
data, and outperforms existing deep approaches to inference in latent time
series. Comment: 8 pages, 4 figures, accepted to AAAI 2020, code available at:
https://github.com/ztangent/multimodal-dm
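The abstract's backward-forward inference applies to nonlinear multimodal deep Markov models; as a minimal linear-Gaussian analogue (an assumption for illustration, not the paper's model), the Kalman filter/smoother below simply skips the measurement update whenever an observation is missing (`NaN`), so it can filter and smooth through long gaps and interpolate the deleted stretch.

```python
import numpy as np

def kalman_smooth(y, a=0.95, q=0.1, r=0.2):
    """Filter then smooth a scalar AR(1) state x_t = a*x_{t-1} + noise,
    observed as y_t = x_t + noise; NaNs in `y` mark missing data."""
    T = len(y)
    m_f = np.zeros(T); P_f = np.zeros(T)   # filtered means / variances
    m, P = 0.0, 1.0                        # prior on the initial state
    for t in range(T):
        if t > 0:                          # forward predict step
            m, P = a * m, a * a * P + q
        if not np.isnan(y[t]):             # update only where observed
            k = P / (P + r)
            m, P = m + k * (y[t] - m), (1 - k) * P
        m_f[t], P_f[t] = m, P
    m_s = m_f.copy()                       # backward (RTS) smoothing pass
    for t in range(T - 2, -1, -1):
        P_pred = a * a * P_f[t] + q
        g = P_f[t] * a / P_pred
        m_s[t] = m_f[t] + g * (m_s[t + 1] - a * m_f[t])
    return m_f, m_s

# Simulate a trajectory, then delete a long run of observations.
rng = np.random.default_rng(0)
x = np.zeros(100)
for t in range(1, 100):
    x[t] = 0.95 * x[t - 1] + rng.normal(scale=np.sqrt(0.1))
y = x + rng.normal(scale=np.sqrt(0.2), size=100)
y[30:60] = np.nan                          # gap spanning 30% of the series
m_f, m_s = kalman_smooth(y)
```

The backward pass mirrors the abstract's backward-forward algorithm at this toy scale: the smoothed estimate `m_s` uses observations on both sides of the gap, which is what makes interpolation and conditional generation possible under heavy data deletion.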